Abstract

Recent literature has shown that the control of False Discovery Rate (FDR) for distributed detection 
in wireless sensor networks (WSNs) can provide substantial improvement in detection performance over 
conventional design methodologies. In this paper, we further investigate system design issues in FDR 
based distributed detection. We demonstrate that improved system design may be achieved by employing 
the Kolmogorov-Smirnov distance metric instead of the deflection coefficient originally proposed in
[1]. We also analyze the performance of FDR based distributed detection in the presence of Byzantines.
Byzantines are malicious sensors which send falsified information to the Fusion Center (FC) to deteriorate 
system performance. We provide analytical and simulation results on the global detection probability as 
a function of the fraction of Byzantines in the network. It is observed that the detection performance 
degrades considerably when the fraction of Byzantines is large. Hence, we propose an adaptive algorithm 
at the FC which learns the Byzantines' behavior over time and changes the FDR parameter to overcome 
the loss in detection performance. Detailed simulation results are provided to demonstrate the robustness 
of the proposed adaptive algorithm to Byzantine attacks in WSNs. 
I. Introduction

In recent years, wireless sensor networks (WSNs) have been extensively employed to monitor a region 
of interest (ROI) for reliable detection/estimation/tracking of events [2], [3], [4], [5]. In this paper, we
focus on distributed target detection in WSNs, which has been a very active area of research in the recent 
past. In distributed detection, due to power and bandwidth constraints, each sensor, instead of sending
its raw data, sends quantized data (a local decision) to a central observer or Fusion Center (FC). The FC
combines these local decisions based on a fusion rule to come up with a global decision. There has been 
extensive research on distributed detection with fusion of local decisions [6]. Optimum fusion rules have
been derived for the distributed detection problem under various assumptions [7], [8], [9], [10], [11]. Most
of these fusion rules require complete knowledge of the local sensor performance metrics, such as the 
probability of detection and false alarm. However, in large wireless sensor networks and under complex 
target signal models, the local sensor performance metrics may not be known or may be very difficult to
estimate. To address the scenario of unknown local sensor performance metrics, the authors of [12], [13]
proposed employing the total number of detections (also referred to as the "count statistic") as the
decision statistic at the FC. The fusion rule based on the count statistic leads to a decision rule in which
the sensors' decisions are weighted equally, even though the SNR at each sensor may be different.

Under fairly general conditions, obtaining the optimal local decision rules has been shown to be a
very difficult problem [14]. Under the conditional independence assumption, it has been shown that
identical local quantizers are optimal under asymptotic conditions (i.e., as the number of sensors $N \to \infty$).
Although the optimality of identical quantizers does not hold in general [15], [6], [14], the design of
non-identical quantizers is computationally very complex, and researchers have generally employed identical
quantizers based on the asymptotic optimality of identical quantizers. Recently, False Discovery Rate (FDR)
based distributed detection has been proposed by Ermis and Saligrama [16] and Ray and Varshney [17],
[1]. In [1], the authors proposed employing non-identical thresholds for distributed detection in
WSNs based on the control of FDR. It has been shown that, under the assumption that the FC employs a
test statistic which is linear in the count (the total number of detections) to reach the global
decision, control of the FDR leads to non-identical local decision rules. This scheme provides significant
improvement in the global detection performance [1]. In [1], the authors suggest maximization of the
deflection coefficient to obtain the FDR design parameter. However, as the count statistic is non-Gaussian
in general, maximization of the deflection coefficient does not guarantee optimal global performance [18].
In this paper, we further analyze system design for FDR based distributed detection and demonstrate that
improved system design may be achieved via optimization of the Kolmogorov-Smirnov distance instead
of the deflection coefficient.

Most of the research in the field of distributed detection has been carried out under the assumption of
a secure network. Only recently have researchers investigated the problem of security threats
[19] on sensor networks. In this paper, we focus on one particular class of security attacks, known as
the Byzantine attack [20], [21], [22], [23] (also referred to as the Data Falsification Attack). A Byzantine
attack involves malicious sensors within the network which send false information to the FC to disrupt
the global decision making process. Byzantines intend to deteriorate the detection performance of the
network by suitably modifying their decisions before transmission to the FC. Marano et al. [20] first
considered the distributed detection problem in the presence of Byzantines under the assumption that the
Byzantines have perfect knowledge of the underlying true hypothesis. In [20], the authors also presented
the optimal attacking distributions for the Byzantines such that the detection error exponent is minimized
at the FC. Rawat et al. [21] considered the more practical case in which the Byzantines do not have
knowledge of the underlying true hypothesis and also proposed a simple algorithm at the FC to identify
the Byzantines in the network. In [24], stochastic resonance [25], [26] was employed to mitigate the
effect of Byzantines in a distributed inference network. In our previous work [23], we analyzed
localization in WSNs in the presence of Byzantines and proposed mitigation techniques to make the
Byzantines 'ineffective'. In this paper, we study the performance of the FDR based distributed detection
framework in the presence of Byzantines. It is observed that the global detection performance deteriorates
rapidly when the fraction of Byzantine sensors increases. Hence, we propose a novel algorithm at the FC,
based on a modified Kolmogorov goodness-of-fit test, which detects the fraction of Byzantines present
in the network and adaptively changes the FDR parameter to improve the detection performance.
The key contributions of this paper are summarized as follows: 

• We propose maximization of the Kolmogorov-Smirnov distance instead of the deflection coefficient to 
obtain the FDR design parameter and demonstrate that it considerably improves system performance. 

• We define a Byzantine attack model and show that the FDR value is controlled even in the presence 
of Byzantines; however the local sensor detection performance deteriorates considerably when the 
fraction of Byzantines is large. 

• We next study the performance of FDR based distributed detection in the presence of Byzantine 
attacks and provide analytical and simulation results on the effect of Byzantines on global detection 
performance. 

• Finally, we propose an algorithm which adaptively changes the system parameters by learning the
Byzantines' behavior over time and demonstrate that the proposed algorithm provides improved
system performance in the presence of Byzantines.

IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS (DRAFT)

The remainder of the paper is organized as follows. In Section II, we introduce the system model and
lay out the assumptions made in the paper. We also formally define False Discovery Rate (FDR) and
briefly discuss the FDR based distributed detection scheme proposed in [1]. We propose some changes
to the system design algorithm proposed in [1] and show the improvement in system performance in
Section III. In Section IV, we show the performance degradation of FDR based schemes in the presence
of Byzantines. We show that although the FDR value is maintained at the specified value, the power of
the test reduces. We provide analytical results on the performance of FDR based distributed detection in
the presence of Byzantines in Section V. We propose a computationally efficient approximation to the
optimal parameter design approach and, building on it, propose the adaptive distributed detection
scheme in Section VI. We conclude with a discussion on possible future directions in Section VII.

II. Preliminaries 

A. System Model 

We consider a parallel fusion network where $N$ sensors are randomly deployed in the Region of
Interest (ROI). Each sensor receives a noisy target signal $s_i$ (for $i = 1, 2, \ldots, N$) and makes a decision $b_i$
regarding the presence/absence of the target, which is then transmitted to the FC. The FC makes a global
decision ($b_0 \in \{0, 1\}$) regarding the presence/absence of the target using the transmitted local decisions
$\{b_i\}_{i=1}^{N}$. We assume that the channels between the local sensors and the FC are ideal (for results
on distributed detection with imperfect channels, see [27], [28], and references therein). We consider the
presence of $M = \alpha N$ ($0 \le \alpha \le 1$) Byzantines in the network. The Byzantines' aim is to send falsified
information to the FC and deteriorate the detection performance. Their model and attack strategy are
described later in Section IV.

As discussed earlier, due to the bandwidth and energy constraints, each local sensor sends a binary 
decision (0/1) to the FC based on a local hypothesis test. The local sensor's hypothesis test can be 
formulated as follows: 

$$H_0 : s_i = n_i \quad \text{(target absent)} \tag{1}$$
$$H_1 : s_i = a_i + n_i \quad \text{(target present)} \tag{2}$$

where $a_i = \sqrt{P_i}$ is the signal amplitude received at the $i$-th sensor due to the presence of the target and
$n_i \sim \mathcal{N}(0, 1)$ represents i.i.d. Gaussian noise. In this paper, we assume that the signal power received
from the target drops to zero outside its finite radius of influence ($d_0$). The general signal
model used is

$$P_i = g(d_i) \tag{3}$$

where $P_i$ is the signal power received at the $i$-th sensor, which is at a distance $d_i$ from the target. As
discussed above, the following $g(\cdot)$ has been used in this paper:

$$g(x) = \begin{cases} P_0, & \text{if } 0 \le x \le d_0 \\ 0, & \text{if } x > d_0 \end{cases} \tag{4}$$



The above model is adopted primarily for analytical convenience; however the results provided in this 
paper may be easily extended to more complex target signal models, such as where the target signal 
decays exponentially or in an inverse square manner with distance. 

The FC makes a global decision based on the vector of local decisions $\mathbf{b} = (b_1, \ldots, b_N)$ received from
all the sensors. The binary hypothesis problem at the FC is

$$G_0 : P(\mathbf{b}; G_0) \quad \text{(target absent)} \tag{5}$$
$$G_1 : P(\mathbf{b}; G_1) \quad \text{(target present)} \tag{6}$$

where $P(\mathbf{b}; G_0)$ and $P(\mathbf{b}; G_1)$ are the distributions of $\mathbf{b}$ in the absence of the target ($G_0$) in the ROI and
in the presence of the target ($G_1$) in the ROI, respectively.

Conventionally, for the distributed detection problem, identical decision rules are used at the local 
sensors. However, in [1], FDR based non-identical sensor decision rules have been proposed and shown
to be superior to identical decision rules. In the next subsections, we introduce the concept of FDR and 
provide a brief description of FDR based distributed detection. 

B. False Discovery Rate (FDR) 

In statistical hypothesis testing, when a family of tests (e.g., multiple binary hypothesis tests) is
conducted, it is often meaningful to define an error rate for the family of tests instead of for an individual
test. The family-wise error rate (FWER) [29] is perhaps the most popular error rate used in the literature. It
is defined as the probability of committing any type I error or false alarm. If the error rate for each test
is $\beta$, then the FWER $\beta_F$ for $k$ independent tests is

$$\beta_F = P(F \ge 1) = 1 - (1 - \beta)^k \tag{7}$$

where $F$ is the total number of false alarms. As can be seen from (7), as the number of tests $k$ increases,
$\beta$ remains constant but $\beta_F$ increases. This is a fundamental problem in Multiple Comparison Procedures
(MCP), and classical comparison procedures aim to control this error measure. The Bonferroni procedure [30]
is a widely employed procedure to control the FWER at a desired rate, but it results in significantly
reduced power (probability of detection). A radically different and more liberal approach proposed by
Benjamini and Hochberg [31] controls the FDR, defined as the fraction of false rejections among those
hypotheses rejected. Formally, FDR is defined as the expected ratio of the number of false alarms (declared
$H_1$ when $H_0$ is true) to the total number of detections ($H_1$ declarations consisting of both true and false
detections).
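The growth of the FWER in (7) with the number of tests can be illustrated numerically. The sketch below is only an illustration under the independence assumption behind (7); the function names `fwer` and `bonferroni_level` are hypothetical and not taken from the cited works.

```python
def fwer(beta: float, k: int) -> float:
    """FWER of Eq. (7): k independent tests, each with false alarm rate beta."""
    return 1.0 - (1.0 - beta) ** k


def bonferroni_level(target_fwer: float, k: int) -> float:
    """Per-test level used by the Bonferroni procedure to keep the FWER at or below target_fwer."""
    return target_fwer / k


# Per-test rate fixed at 0.05: the FWER grows quickly with k.
# With the Bonferroni level the FWER stays below 0.05, at the cost of a much stricter per-test level.
for k in (1, 5, 20, 100):
    uncorrected = fwer(0.05, k)
    corrected = fwer(bonferroni_level(0.05, k), k)
    print(k, round(uncorrected, 4), round(corrected, 4))
```

The stricter per-test level in the second column is what reduces the power of each individual test, which motivates the more liberal FDR criterion.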



TABLE I: Notations to define FDR

                Declared H_0    Declared H_1    Total
H_0 true        W               F               N_0
H_1 true        T               S               N - N_0
Total           N - R           R               N


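From Table I, the ratio of false alarms to the total number of detections can be viewed as the random
variable

$$Q = \begin{cases} \dfrac{F}{F + S}, & \text{if } F + S \neq 0 \\ 0, & \text{if } F + S = 0 \end{cases} \tag{8}$$

FDR ($Q_e$) is defined to be the expectation of $Q$,

$$Q_e = E(Q) \tag{9}$$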



This metric was proposed by Benjamini and Hochberg [31] along with the following centralized
algorithm to control FDR for multiple comparisons.

Algorithm to control FDR: Suppose $p_1, p_2, \ldots, p_N$ are the p-values for $N$ tests and $p_{(1)}, p_{(2)}, \ldots, p_{(N)}$
denote the ordered p-values. The p-value for an observation $s_i$ is defined as

$$p_i = \int_{s_i}^{\infty} f_0(s)\, ds \tag{10}$$

where $f_0(s)$ is the probability density function of the observation under $H_0$. The algorithm by Benjamini
and Hochberg [31], which keeps the FDR below a value $\gamma$, is provided below.
1. Calculate the p-values of all the observations and arrange them in ascending order. 






2. Let $d$ be the largest $i$ for which $p_{(i)} \le i\gamma/N$.

3. Declare all observations corresponding to $p_{(i)}$, $i = 1, \ldots, d$, as $H_1$.
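The three steps above can be sketched in a few lines of Python. This is a minimal centralized illustration, assuming unit-variance Gaussian noise under $H_0$ so that the p-value in (10) is a right-tail Gaussian probability; the function names and the sample observations are hypothetical.

```python
from statistics import NormalDist


def p_value(s: float) -> float:
    """Right-tail p-value of observation s under H0: s ~ N(0, 1), as in Eq. (10)."""
    return 1.0 - NormalDist().cdf(s)


def benjamini_hochberg(p_values, gamma):
    """Return the (sorted) indices of the observations declared H1 at FDR level gamma."""
    n = len(p_values)
    # Step 1: order the p-values in ascending order, remembering the original indices.
    order = sorted(range(n), key=lambda i: p_values[i])
    # Step 2: find the largest rank d with p_(d) <= d * gamma / n.
    d = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * gamma / n:
            d = rank
    # Step 3: declare the d smallest p-values as H1.
    return sorted(order[:d])


observations = [3.1, 0.2, 2.4, -0.5, 1.9, 0.1]  # hypothetical sensor observations
p = [p_value(s) for s in observations]
print(benjamini_hochberg(p, gamma=0.25))
```

Note that the loop does not stop at the first rank that fails the condition; it keeps the largest rank that satisfies it, so a small p-value that narrowly misses its own threshold can still be declared $H_1$.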

Under the assumption of independence of test statistics corresponding to the true null hypotheses ($H_0$),
this procedure controls the FDR at $\gamma$. Note that the FDR based decision making system looks for the
largest index $i = d$ such that $p_{(d)} \le d\gamma/N$. There may be other indices $i = l$, where $l < d$, for which the
condition $p_{(l)} \le l\gamma/N$ holds, but the FDR based decision system looks for the largest value of $i$
for which this is true. The reason behind this, as discussed in [31] and pointed out in subsequent work,
is to achieve the largest probability of detection while constraining the FDR to less than or equal to $\gamma$.
Further discussion, including the proof of this algorithm, is omitted for brevity and may
be found in [31].



The above algorithm requires the ordering of p-values, and the procedure conventionally needs centralized
processing. For the distributed detection problem considered in this paper, the sensors can only send
one bit to the FC and hence a distributed ordering scheme is necessary. A decentralized FDR procedure
has been proposed in [1] that achieves the same performance as the Benjamini and Hochberg procedure.
This algorithm is based on the fact that the only information required by the FC is the number of $H_1$
declarations (henceforth referred to as the count statistic, $\Lambda$). The algorithm and further discussion may
be found in [1].

III. System Design 

In this section, we summarize the design guidelines for FDR based distributed detection proposed in
[1] and show that the system design aspects need to be revisited. The design parameters
of an FDR based distributed detection system [1] are $\gamma$ and $T_{FDR}$, where $\gamma$ is the FDR parameter and
$T_{FDR}$ is the global threshold. For the sake of comparison, we also study system design for an identical
threshold scheme, where the design parameters are $\tau$ and $T_{IT}$, where $\tau$ is the local observation threshold
parameter ($Q(\tau) = p_{fa}$ is the threshold on the p-values) and $T_{IT}$ is the global threshold. The system-wide
probability of false alarm for the FDR based distributed detection system is given by

$$P_{FA} = P(\Lambda > T_{FDR}; G_0) + k\, P(\Lambda = T_{FDR}; G_0) \tag{11}$$

where $k$ is the randomization parameter. Similarly, the system-wide probability of detection is given by

$$P_D = P(\Lambda > T_{FDR}; G_1) + k\, P(\Lambda = T_{FDR}; G_1) \tag{12}$$

where the p.m.f. of the count statistic is given by Propositions 2, 4, 5 and 6 of [1]. The system-wide
performance metrics for the identical threshold scheme may be obtained similarly.
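To make (11) and (12) concrete, the following sketch computes the global threshold, the randomization parameter and the resulting detection probability from two given p.m.f.s of the count statistic. It is only an illustration: the binomial p.m.f.s are placeholders chosen for convenience and are not the p.m.f.s derived in [1], and the function names are hypothetical.

```python
from math import comb


def binomial_pmf(n, p):
    """Placeholder pmf of a count on 0..n (stand-in for the count pmfs of [1])."""
    return [comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(n + 1)]


def randomized_count_test(pmf_g0, pmf_g1, target_pfa):
    """Return (T, k, P_D) with P(count > T; G0) + k * P(count = T; G0) = target_pfa.

    Assumes 0 < target_pfa <= 1 and that both pmfs are defined on the same support.
    """
    tail0 = 0.0  # P(count > T; G0)
    tail1 = 0.0  # P(count > T; G1)
    for T in range(len(pmf_g0) - 1, -1, -1):
        if tail0 + pmf_g0[T] >= target_pfa:
            k = (target_pfa - tail0) / pmf_g0[T]   # randomization parameter of Eq. (11)
            return T, k, tail1 + k * pmf_g1[T]     # detection probability of Eq. (12)
        tail0 += pmf_g0[T]
        tail1 += pmf_g1[T]
    return 0, 1.0, 1.0  # fallback: always decide "target present"


g0 = binomial_pmf(20, 0.05)  # hypothetical count pmf under G0
g1 = binomial_pmf(20, 0.30)  # hypothetical count pmf under G1
T, k, pd = randomized_count_test(g0, g1, target_pfa=0.1)
print(T, round(k, 4), round(pd, 4))
```

Randomization is needed because the count statistic is discrete, so most target values of $P_{FA}$ cannot be met exactly by a deterministic threshold.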
It has been shown in [1] that the choice of the optimal FDR parameter $\gamma$ or identical decision threshold $\tau$,
where optimality is with respect to system-level detection performance, is a difficult problem. Optimizing
the ROC via simulation or numerical computation is very complex (see [1] for further discussion). A
computationally less intensive approach is to use distance measure based optimization for system design.
Motivated by this, in [1], optimization of the deflection coefficient of the count statistic ($\Lambda$) was proposed
to find the optimal $\gamma$ or $\tau$. Intuitively, an increased deflection coefficient generally implies greater separation
between the pmfs of the count statistic under global hypotheses $G_0$ and $G_1$ and is likely to lead to better
detector design. Also, the distribution of the count statistic under asymptotic conditions is Gaussian,
for which it is known that maximization of the deflection coefficient leads to an optimal detector [18].
However, in this paper, we show that maximization of the deflection coefficient may not be the best
design criterion for an FDR based detection system under non-asymptotic conditions. Since the distribution
of the count statistic ($\Lambda$) is non-Gaussian in general, it is likely that the deflection coefficient fails to
characterize its performance completely. We study the performance of several candidate distance measures
(such as Kullback-Leibler Divergence, Bhattacharyya Distance and Kolmogorov-Smirnov Distance) for
system design.

As a motivating example, we perform the following simulation. Let $N = 20$ sensors be randomly
distributed within a circular ROI of radius $R = 10$. The pdf of the sensor locations $r$ ($r$ is measured
from the target location) is given by $f_R(r) = 2r/R^2$, $0 \le r \le R$. The optimal values of the parameters $\gamma$
and $\tau$ obtained for a target model with power $P_0 = 5$ and radius of influence $d_0 = 5$ depend on the
system-wide probability of false alarm, $P_{FA}$. For $P_{FA} = 0.1$, the optimal parameters that maximize $P_D$
are found to be $\gamma_{opt} = 0.25$ and $\tau_{opt} = Q^{-1}(0.005)$. However, when the deflection coefficient is used, we
get the parameters $\gamma = 0.008$ and $\tau = Q^{-1}(0.00005)$. The optimization has been done
numerically through simulations and the plots have been omitted for the sake of brevity.
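The deployment model of this example is straightforward to reproduce. The sketch below draws sensor distances with pdf $2r/R^2$ through the inverse CDF $r = R\sqrt{u}$ and applies the signal model (4); it is an illustrative reconstruction under those assumptions, not the simulation code behind the reported numbers, and the function names are hypothetical.

```python
import math
import random


def sample_sensor_distances(n, radius, rng):
    """Draw n sensor-to-target distances with pdf 2r / radius**2 (inverse CDF: r = radius * sqrt(u))."""
    return [radius * math.sqrt(rng.random()) for _ in range(n)]


def received_power(distance, p0, d0):
    """Signal model of Eq. (4): constant power p0 inside the radius of influence d0, zero outside."""
    return p0 if distance <= d0 else 0.0


rng = random.Random(0)  # fixed seed so the illustration is repeatable
distances = sample_sensor_distances(100_000, 10.0, rng)
inside = sum(received_power(d, 5.0, 5.0) > 0 for d in distances) / len(distances)
print(inside)  # close to (d0 / R)**2 = 0.25
```

With $R = 10$ and $d_0 = 5$, only about a quarter of the sensors receive any target signal, so on average 5 of the $N = 20$ sensors carry information about the target.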

The above example shows that the deflection coefficient does not always yield the optimal value
of the parameters and, as shown, can deviate substantially from the optimal. This makes it necessary
to come up with a distance measure which is computationally more efficient than ROC based
optimization and provides a better approximation to the optimal solution. In this paper, we present
some comparative performance results for four candidate measures: Deflection Coefficient [12],
Kullback-Leibler Divergence [32], Bhattacharyya Distance and Kolmogorov-Smirnov Distance [33]. Table
II compares the optimal parameter values found by the optimization of each of the distance measures
against the true optimal found by ROC based optimization ($P_D$ maximization while fixing $P_{FA} = 0.1$)
for both schemes: FDR based threshold design and identical threshold design.
TABLE II: Performance comparison using different distance measures

Metric                                      FDR based threshold scheme ($\gamma$)    Identical threshold scheme ($Q(\tau) = p_{fa}$)
ROC based optimization                      0.25                                     0.005
Deflection Coefficient optimization         0.008                                    0.00005
Kullback-Leibler Divergence optimization    0.1                                      0.00005
Bhattacharyya Distance optimization         0.2                                      0.01
Kolmogorov-Smirnov Distance optimization    0.2                                      0.005



For the example considered, as can be seen from Table II, the Kolmogorov-Smirnov Distance (KSD)
best approximates the optimal parameters compared to the other candidate measures for both the FDR based
scheme and the identical threshold scheme. We also noted a similar result for a wide range of simulation
parameters. The deflection coefficient fails to find the optimal parameters as it only looks at the first
and second moments of the distributions, which is not sufficient when the underlying distribution is
non-Gaussian. In our context, the count statistic ($\Lambda$) is a non-Gaussian, discrete random variable and, therefore,
the deflection coefficient does not fully characterize its distribution. In the remainder of this
section, we formally define the Kolmogorov-Smirnov distance and perform some empirical studies based
on the K-S distance to find the optimal parameter values for distributed detection schemes (both the FDR based
scheme and the identical threshold scheme).

Definition: The Kolmogorov-Smirnov distance (K-S distance) is defined as the maximum value of the
absolute difference between two cumulative distribution functions [33].

The Kolmogorov-Smirnov test measures the distance between the empirical distribution function of the
sample data and the cumulative distribution function of the reference distribution. The K-S distance is
non-parametric and distribution free, i.e., it makes no assumption about the underlying data distribution.
The K-S distance between two cdfs $F(\cdot)$ and $G(\cdot)$ is defined as

$$KSD(F(\cdot), G(\cdot)) = \sup_{0 \le x \le 1} |F(x) - G(x)| \tag{13}$$

In our context, $F(\cdot)$ and $G(\cdot)$ represent the cdfs of the count statistic ($\Lambda$) under global hypotheses $G_0$
and $G_1$. The optimal value of the local decision threshold parameter ($\gamma$ for the FDR based scheme
or $p_{fa}$ for the identical threshold scheme) is found by maximizing the K-S distance.
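For a discrete count statistic, the K-S distance in (13) reduces to the largest gap between two cumulative sums. The sketch below shows this computation and a simple grid search over candidate parameter values; the helper names and the toy p.m.f.s are hypothetical, and the callable `pmf_pair` stands in for whatever routine supplies the count p.m.f.s under $G_0$ and $G_1$ for a given parameter.

```python
from itertools import accumulate


def ks_distance(pmf_f, pmf_g):
    """Maximum absolute difference between the cdfs of two pmfs defined on the same support."""
    cdf_f = accumulate(pmf_f)
    cdf_g = accumulate(pmf_g)
    return max(abs(a - b) for a, b in zip(cdf_f, cdf_g))


def best_parameter(candidates, pmf_pair):
    """Pick the candidate whose (pmf under G0, pmf under G1) pair has the largest K-S distance."""
    return max(candidates, key=lambda c: ks_distance(*pmf_pair(c)))


# Toy pmfs on the support {0, 1, 2}; the cdfs are (0.5, 0.8, 1.0) and (0.1, 0.4, 1.0).
print(ks_distance([0.5, 0.3, 0.2], [0.1, 0.3, 0.6]))


# Toy grid search: the pmf pair becomes less separated as c approaches 0.5.
def toy_pair(c):
    return [1.0 - c, c], [c, 1.0 - c]


print(best_parameter([0.1, 0.3, 0.5], toy_pair))
```

Each candidate costs one pass over the support of the count statistic, which is far cheaper than estimating a full ROC per candidate.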
It is important to note here that the optimal parameter value depends on the system requirement on the
system-wide probability of false alarm ($P_{FA}$). When $P_{FA} = 0.2$ is used, the optimal parameter values
based on ROC optimization come out to be $\gamma = 0.2$ and $p_{fa} = 0.01$, which again are close to the
parameter values found by using the K-S distance measure for optimization. It is also important to note
that although the optimal parameter value changes, the FDR based decision threshold scheme proposed
in [1] outperforms the identical threshold scheme even when the K-S distance is used as the distance metric,
as seen in Fig. 1. Fig. 1 shows the ROC obtained by using the parameters obtained by K-S distance
optimization instead of deflection coefficient optimization. Note that the ROCs obtained by deflection
coefficient optimization in [1] are not as good as the ROCs obtained in Fig. 1 using K-S distance
optimization.



Fig. 1: ROC using the optimal parameter values found using K-S distance



From empirical studies, the K-S distance seems to provide a good approximation to the optimal. We
hypothesize that since the K-S distance measures the overlap between the distributions and is non-parametric
in nature, it provides better results compared to deflection coefficient maximization. However, finding the
best distance measure to use is an interesting but difficult problem considering the large number
of candidate measures available in the literature (Ali-Silvey Distance Measures [34]). We leave this for
future work and, for the remaining part of the paper, we use the K-S distance or ROC based optimization to
find the optimal local threshold parameters.
IV. Control of FDR in the presence of Byzantines

In this section, we consider the problem of FDR based distributed detection in the presence of 
Byzantines. We first show the effect of Byzantines on the control of FDR in multiple comparison problems 
and in subsequent sections analyze the effect of Byzantines on FDR based distributed detection. 

Byzantines are local sensors which send falsified information to the FC to deteriorate system
performance. Since control of FDR is based on p-values, the Byzantines' attack strategy would be to
report a falsified p-value denoted by $q_i = h(p_i)$, where $h(\cdot)$ is a transformation used by the Byzantines
and $p_i$ is the true p-value of the $i$-th sensor. The transformation $h(\cdot)$ needs to satisfy the properties listed
below:

1. $h(\cdot)$ should be a function whose domain and range are $[0, 1]$, i.e., $h : [0, 1] \to [0, 1]$.

2. $h(\cdot)$ should be a decreasing function. This property is essential since the Byzantines' aim is to
deteriorate the detection performance [20]. As the p-value represents the 'confidence' of the target
being absent, they would like to report falsified information by reversing it. A decreasing function
ensures a higher q-value for a lower p-value and vice versa.

One possible transformation is the linearly decreasing function $h(p) = 1 - p$, which is equivalent to
flipping of the local decision. This transformation has been shown to be the optimal attack for Byzantines
in distributed detection using identical local thresholds [21] and target localization with quantized data
[23]. In the remainder of this paper, we use the above transformation to model the Byzantine attack.
Therefore, if $p_i$ represents the true p-value of the $i$-th sensor, then the reported p-values are given by

$$q_i = \begin{cases} p_i, & \text{if the } i\text{-th sensor is honest} \\ 1 - p_i, & \text{if the } i\text{-th sensor is Byzantine} \end{cases} \tag{14}$$
In the rest of the section, we show the effect of Byzantines on the control of FDR. As mentioned
previously, an important aspect of FDR based detection is the control of the FDR value at the
pre-determined threshold $\gamma$. The FDR control algorithm provided earlier in Section II-B controls the FDR
value at $\gamma$ when the true hypothesis is $H_0$ and at a value less than or equal to $\gamma$ when the true hypothesis
is $H_1$. We now prove that the FDR value is controlled even in the presence of Byzantines.

Proposition 4.1: Let $N$ sensors be randomly deployed in the ROI. The local sensors report local decisions
to the FC using their p-values and the FC makes the final decision regarding the presence/absence
of the target using these local decisions. Let there be $M$ Byzantines in the network which transform the
p-values according to (14) and report decisions based on the falsified p-values. For independent local
sensors under global hypothesis $G_0$, the FDR control algorithm of Section II-B controls the FDR value
at the pre-determined threshold $\gamma$, even in the presence of Byzantines.

Proof: In order to prove this proposition, we need to find the distribution of p-values under both
hypotheses in the presence of Byzantines. The Byzantines' model used in this paper is that of transforming
the p-values according to (14). For a Gaussian random variable $\mathcal{N}(\phi, 1)$, the density of the p-value
is given by

$$f_\phi(u) = \exp\left(-\frac{\phi^2}{2}\right) \exp\left(\phi\, Q^{-1}(u)\right), \quad 0 \le u \le 1 \tag{15}$$

From this result, under the null hypothesis $H_0$, the true p-values follow the uniform distribution ($\phi = 0$). Since
the Byzantines' effect is the transformation given by (14), the distribution of the reported p-values under
$H_0$ can be found by the transformation of random variables $V = 1 - U$. Due to the probability integral
transform, the reported p-values also follow the uniform distribution under the null hypothesis.

The true p-value under the alternate hypothesis $H_1$ follows the distribution given by (15) with
$\phi = \sqrt{P_0}$. For the Byzantines, the reported p-value is given by $q = 1 - p$. Using the change of variables
$V = 1 - U$,

$$f_V(v) = f_\phi(u)\big|_{u = 1 - v} \left|\frac{du}{dv}\right| \tag{16}$$

This gives us the distribution of the reported p-values $f_V(v)$ as

$$f_V(v) = \exp\left(-\frac{\phi^2}{2}\right) \exp\left(\phi\, Q^{-1}(1 - v)\right), \quad 0 \le v \le 1 \tag{17}$$

where $\phi$ is the received signal amplitude at the local sensor, which is either $0$ or $\sqrt{P_0}$ under hypothesis
$H_0$ and $H_1$, respectively.

The proof of this proposition then follows from the straightforward observation that the FDR algorithm
proposed by Benjamini and Hochberg [31] controls the FDR value for any configuration of the false null
hypotheses. The condition that p-values are uniformly distributed under the true null hypothesis is still satisfied
and, since the proof does not depend on the p-values' distribution under $H_1$, the FDR value is controlled
at the pre-determined threshold. ■
We now show via simulations that the FDR value is controlled at $\gamma$ even in the presence of Byzantines.
Figs. 2 and 3 show the FDR value with varying fraction of Byzantines ($\alpha$) present in the network. Let
us consider a distributed detection system with the parameters $N = 20$, $R = 10$, $d_0 = 5$, $P_0 = 5$ and
the FDR parameter 7 = 0.25. The simulation results are for 5 X 10"^ Monte-Carlo runs. As can be seen 
from the figures, the FDR value is maintained at $\gamma$ under $G_0$ and at a value less than or equal to $\gamma$ under
$G_1$ even in the presence of Byzantines.
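The behavior under $G_0$ can be checked with a short Monte-Carlo sketch. The code below is a simplified, centralized stand-in for the simulation described above, assuming independent uniform p-values when no target is present and Byzantines that report $1 - p$; the function names, the seed and the number of runs are illustrative assumptions, not the settings used for Figs. 2 and 3.

```python
import random


def bh_count(p_values, gamma):
    """Number of H1 declarations made by the Benjamini-Hochberg procedure at level gamma."""
    n = len(p_values)
    d = 0
    for rank, p in enumerate(sorted(p_values), start=1):
        if p <= rank * gamma / n:
            d = rank  # keep the largest rank satisfying the condition
    return d


def fdr_under_g0(n, gamma, alpha, runs, rng):
    """Estimate the FDR with no target present when round(alpha * n) sensors report 1 - p."""
    m = round(alpha * n)
    hits = 0
    for _ in range(runs):
        true_p = [rng.random() for _ in range(n)]  # p-values are uniform under H0
        reported = [1.0 - p if i < m else p for i, p in enumerate(true_p)]
        # Under G0 every detection is a false alarm, so Q = 1 whenever the count is nonzero.
        hits += bh_count(reported, gamma) > 0
    return hits / runs


rng = random.Random(1)  # fixed seed so the illustration is repeatable
for alpha in (0.0, 0.5, 1.0):
    print(alpha, fdr_under_g0(20, 0.25, alpha, 20_000, rng))
```

For every value of $\alpha$ the estimate stays near $\gamma = 0.25$, which is what Proposition 4.1 predicts: flipping a uniform p-value yields another uniform p-value, so the procedure cannot tell the difference under $G_0$.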
Fig. 2: FDR value against the fraction of Byzantines when the true hypothesis is $G_0$

Fig. 3: FDR value against the fraction of Byzantines when the true hypothesis is $G_1$ (curves: FDR value in the presence of Byzantines, FDR threshold; horizontal axis: fraction of Byzantines $\alpha$)



This ineffectiveness of Byzantines on the FDR value is expected because the effect of Byzantines 
changes the order of the reported p-values (q-values) and the largest threshold crossing index would be 
different as compared to the largest threshold crossing index on the original p-values. In other words, in 
the presence of a target, most of the true p-values are small and therefore the threshold crossing would be 



closer to the right extremal resulting in a high number of detections as depicted by an example in Fig. 4a 



In the presence of Byzantines, the p-values get transformed as defined by ([14]) and the reported p-values 



become larger and the threshold crossing shifts to the left, as shown in Fig. 4b This reduces the number 



DRAFT 



IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS (DRAFT) 



14 



0.8 - 
0.7 - 

I 0.5- 

Q. 

0.4 - 
0.3 - 
0.2 - 
0.1 - 



+ True p-values 
Linearly increasing thresliold 



Tliresliold Crossing 




-X) — © — © — © — © — © — © — © — © — 9 



o Reported false p-values 
Linearly increasing threshold 



Threshold 
Crossing 
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Fig. 4: p-values against linearly increasing threshold in the presence of target 



of detections and it is equivalent to looking at an earlier threshold crossing index on the true p-values. 



As pointed out in Section II-B, the FDR algorithm looks at the largest index $i$ for which the $i$-th smallest p-value is at most $i\gamma/N$, in order to maximize the power of the test. Observe that when $\alpha = 1$, i.e., all the sensors are Byzantines, the order of the q-values is reversed compared to the p-values, and the FDR algorithm based on the q-values ends up looking at the smallest index of p-values satisfying this inequality rather than the largest index. Under this observation, we conjecture that for $0 < \alpha < 1$, the FDR algorithm based on the q-values ends up looking at an index of p-value between the largest and the smallest indices satisfying the inequality. Since the proof of control of FDR does not depend on whether it is the largest threshold crossing index or not, the FDR value is maintained at the required threshold. However, the power of the test degrades in the presence of Byzantines.
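The effect described above can be illustrated with a minimal sketch, assuming the step-up rule "largest $i$ whose $i$-th smallest value is at most $i\gamma/N$" and the flipping attack $q = 1 - p$. The function names `bh_count` and `report` and the five p-values are hypothetical and chosen only for illustration.

```python
def bh_count(pvalues, gamma):
    """Number of detections declared by the step-up FDR rule."""
    n = len(pvalues)
    ordered = sorted(pvalues)
    count = 0
    for i, p in enumerate(ordered, start=1):
        if p <= i * gamma / n:
            count = i  # keep the largest index that crosses the threshold
    return count


def report(pvalues, byzantine):
    """Reported q-values: Byzantine sensors send 1 - p, honest sensors send p."""
    return [1.0 - p if b else p for p, b in zip(pvalues, byzantine)]


# Hypothetical true p-values in the presence of a target (mostly small).
true_p = [0.001, 0.004, 0.01, 0.02, 0.6]
gamma = 0.25

honest = bh_count(true_p, gamma)                                       # no Byzantines
partial = bh_count(report(true_p, [True, True, False, False, False]), gamma)
attacked = bh_count(report(true_p, [True] * 5), gamma)                 # alpha = 1
print(honest, partial, attacked)  # 4 2 0
```

In this toy case the count drops from 4 to 2 to 0 as more sensors flip their p-values, which is the loss of power discussed in the text.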

In the above discussion, we have shown that the FDR value is not affected by the presence of Byzantines. However, it was also pointed out that the FDR control algorithm on the reported q-values is equivalent to looking at an earlier index on the true p-values. In the presence of Byzantines, the number of detections is reduced and, therefore, the distribution of the count statistic (number of detections) under $G_1$ moves closer to the distribution of the count under $G_0$, which remains unchanged in the presence of Byzantines. This makes it difficult to distinguish between the two hypotheses.

In the following section, we explore this intuitive observation, that the Byzantines bring the distribution of the count statistic under $G_1$ closer to its distribution under $G_0$, in the context of FDR based distributed detection. We show analytically the effect of Byzantines on the count statistic and derive the distributions of the count statistic under $G_0$ and $G_1$.

V. FDR Based Distributed Detection in the Presence of Byzantines

In order to understand the behavior of the FDR based distributed detection scheme in the presence of Byzantines, we require the p.m.f. of the count statistic ($\Lambda$) in the presence of Byzantines. These results are provided in the rest of this section.

A. Probability mass function of Count ($\Lambda$)

Let the observed p-values be denoted by the random variables $\{U_i\}_{i=1}^{N}$ and the reported p-values (q-values) be denoted by the transformed random variables $\{V_i\}_{i=1}^{N}$, where the transformation is as follows:

$$V_i = \begin{cases} U_i & \text{if } i \text{ is an honest sensor} \\ 1 - U_i & \text{if } i \text{ is a Byzantine sensor} \end{cases} \tag{18}$$

Proposition 5.1: The probability of $\Lambda = i$ local false alarms (count) for $N$ sensors containing $M = \alpha N$ Byzantines in the ROI, with control of FDR at $\gamma$ under $G_0$ (absence of target in the ROI), is given by

$$P(\Lambda = i; G_0) = \binom{N}{i}(1-\gamma)\left(\frac{i\gamma}{N}\right)^{i}\left(1 - \frac{i\gamma}{N}\right)^{N-i-1} \tag{19}$$

In the absence of a target, the p-values of both the Byzantines and the honest sensors are uniformly distributed, that is, both $U_i$ and $V_i$ are uniformly distributed. Therefore, the result remains the same as derived by Finner and Roters [35] using Dempster's formula for barrier crossing for uniform random variables, irrespective of the presence of Byzantines. Similarly, the asymptotic distribution can be found as

$$\lim_{N \to \infty} P(\Lambda = i; G_0) = \frac{i^i}{i!}(1-\gamma)\gamma^i \exp(-i\gamma) \tag{20}$$
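The limit (20) can be checked numerically with a short sketch. The finite-$N$ expression used below, $\binom{N}{i}(1-\gamma)(i\gamma/N)^i(1-i\gamma/N)^{N-i-1}$, is our reading of the Finner and Roters result and should be treated as an assumption of this example; the function names are hypothetical.

```python
import math


def pmf_g0_finite(i, n, gamma):
    """Assumed finite-N form: C(N,i) (1-g) (i g/N)^i (1 - i g/N)^(N-i-1)."""
    x = i * gamma / n
    return math.comb(n, i) * (1.0 - gamma) * x ** i * (1.0 - x) ** (n - i - 1)


def pmf_g0_asymptotic(i, gamma):
    """Limit (20): (i^i / i!) (1-g) g^i exp(-i g)."""
    return (i ** i / math.factorial(i)) * (1.0 - gamma) * gamma ** i * math.exp(-i * gamma)


n, gamma = 20, 0.25
finite = [pmf_g0_finite(i, n, gamma) for i in range(n + 1)]
total = sum(finite)  # should be 1 if the assumed form is a valid p.m.f.

# Compare the two expressions for a large number of sensors.
gap = abs(pmf_g0_finite(3, 100000, gamma) - pmf_g0_asymptotic(3, gamma))
print(total, gap)
```

Neither expression involves $\alpha$, in line with the statement that the count distribution under $G_0$ is unaffected by the Byzantines.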

Proposition 5.2: The probability of $\Lambda = i$ local detections (count) for $N$ sensors containing $M = \alpha N$ Byzantine sensors in the ROI, with control of FDR at $\gamma$ under $G_1$ (presence of target in the ROI), is given by

$$P(\Lambda = i; G_1) = \frac{1}{\binom{N}{M}} \sum \int_{v_{N,N}=\gamma}^{1} \cdots \int_{v_{i+1,N}=((i+1)/N)\gamma}^{v_{i+2,N}} \int_{v_{i,N}=0}^{i\gamma/N} \cdots \int_{v_{1,N}=0}^{v_{2,N}} N!\, f_{V_1}(v_1) f_{V_2}(v_2) \cdots f_{V_N}(v_N)\, dv_{1,N}\, dv_{2,N} \cdots dv_{N,N} \tag{21}$$

where the sum runs over the $\binom{N}{M}$ possible choices of the Byzantine sensors.

Also, asymptotically, i.e., for large $N$,

$$P(\Lambda = i; G_1) = \sum_{k=\max(0,\, M-N+i)}^{\min(M,\, i)} \binom{M}{k}\binom{N-M}{i-k} (p_D^B)^{k} (1 - p_D^B)^{M-k} (p_D^H)^{i-k} (1 - p_D^H)^{N-M-i+k} \tag{22}$$

or

$$P(\Lambda = i; G_1) = \binom{N}{i} (p_D)^{i} (1 - p_D)^{N-i} \tag{23}$$

where $p_D^H$ is the average probability of reporting '1' under $G_1$ for an honest local sensor, $p_D^B$ is the average probability of reporting '1' under $G_1$ for a Byzantine local sensor, and $p_D = \alpha p_D^B + (1 - \alpha) p_D^H$ is the average probability of reporting '1' under $G_1$.
Proof: The proof is provided in Appendix A. ■
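The two asymptotic forms (22) and (23) can be evaluated directly. The sketch below assumes the local probabilities $p_D^H$ and $p_D^B$ are already known; the numerical values and function names are hypothetical and serve only to exercise the formulas.

```python
import math


def pmf_mixture(i, n, m, p_h, p_b):
    """Eq. (22): k of the i detections come from the m Byzantine sensors."""
    total = 0.0
    for k in range(max(0, m - n + i), min(m, i) + 1):
        byz = math.comb(m, k) * p_b ** k * (1 - p_b) ** (m - k)
        hon = math.comb(n - m, i - k) * p_h ** (i - k) * (1 - p_h) ** (n - m - i + k)
        total += byz * hon
    return total


def pmf_binomial(i, n, p):
    """Eq. (23): Binomial(n, p) with the averaged probability p."""
    return math.comb(n, i) * p ** i * (1 - p) ** (n - i)


n, m = 20, 8            # alpha = m / n = 0.4
p_h, p_b = 0.6, 0.1     # hypothetical honest and Byzantine probabilities
p_avg = (m / n) * p_b + (1 - m / n) * p_h

mix = [pmf_mixture(i, n, m, p_h, p_b) for i in range(n + 1)]
binom = [pmf_binomial(i, n, p_avg) for i in range(n + 1)]
mean_mix = sum(i * q for i, q in enumerate(mix))
mean_binom = sum(i * q for i, q in enumerate(binom))
print(mean_mix, mean_binom)
```

The two forms coincide exactly when $p_D^H = p_D^B$ and otherwise share the same mean $N p_D$, so (23) is best read as a simpler approximation of (22).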

It is interesting to observe here that the Byzantines only affect the p.m.f. of the count statistic ($\Lambda$) under $G_1$, while the p.m.f. under $G_0$ remains the same. The reason is that the p-values under $G_0$ are uniformly distributed, and the transformation $h(p) = 1 - p = q$ keeps the q-values uniformly distributed under $G_0$. However, the p.m.f. under $G_1$ changes as the p-values are no longer uniformly distributed.



B. Numerical Results 

In this section, we provide numerical and simulation results to validate the analytical expressions obtained for the p.m.f. of the count statistic under FDR based threshold design. Tables III, IV and V show the numerical and simulation results of the analytically derived $P(\Lambda = i; G_1)$ for the FDR based scheme for different values of $\alpha$. The signal and target parameters are: $P_0 = 3$, $R = 10$, $d_0 = 3$, $N = 4$ and FDR parameter $\gamma = 0.1$. The integrals given in (21) have been evaluated using Monte Carlo integration
methods [36]. The simulation results are for 5 x 10^ Monte Carlo runs. These tables show that the 
numerical and simulation results match very closely. 



TABLE III: Numerical and simulation results for $P(\Lambda = i; G_1)$ for control of FDR when $\alpha = 0$

Count                              0        1        2        3        4
$P(\Lambda; G_1)$ (Numerical)      0.7777   0.1777   0.0407   0.0061   3.1544 x 10^
$P(\Lambda; G_1)$ (Simulations)    0.7756   0.1773   0.0401   0.0065   5.1000 x 10^

TABLE IV: Numerical and simulation results for $P(\Lambda = i; G_1)$ for control of FDR when $\alpha = 0.5$

Count                              0        1        2        3        4
$P(\Lambda; G_1)$ (Numerical)      0.8355   0.1347   0.0227   0.0027   1.8121 x 10^
$P(\Lambda; G_1)$ (Simulations)    0.8383   0.1353   0.0233   0.0028   1.7000 x 10^



TABLE V: Numerical and simulation results for $P(\Lambda = i; G_1)$ for control of FDR when $\alpha = 1$

Count                              0        1        2        3        4
$P(\Lambda; G_1)$ (Numerical)      0.9022   0.0789   0.0106   0.0012   6.2896 x 10^
$P(\Lambda; G_1)$ (Simulations)    0.9084   0.0793   0.0111   0.0011   6.4000 x 10^



In Figs. 5, 6 and 7, we provide the simulation results and the analytical approximation for the p.m.f. of the count statistic $P(\Lambda; G_1)$ for a large number of sensors in the ROI. The simulation parameters are $N = 500$, $P_0 = 15$, $R = 10$, $d_0 = 3$ and FDR parameter $\gamma = 0.0077$. The simulation results are
for 5 X 10^ Monte Carlo runs. The simulation results demonstrate that the asymptotic expressions for 
$P(\Lambda; G_1)$ match the simulation results very well.

Fig. 5: Simulated and analytical results for $P(\Lambda; G_1)$ for FDR based scheme under asymptotic conditions when $\alpha = 0.1$

Fig. 6: Simulated and analytical results for $P(\Lambda; G_1)$ for FDR based scheme under asymptotic conditions when $\alpha = 0.4$

Fig. 7: Simulated and analytical results for $P(\Lambda; G_1)$ for FDR based scheme under asymptotic conditions when $\alpha = 0.8$

Fig. 8 shows the reduction in the detection performance with the increase in the number of Byzantines in the network. The simulation parameters are: $R = 10$, $d_0 = 5$, $P_0 = 5$, $N = 20$, and the system-wide probability of false alarm is fixed at $P_{FA} = 0.1$. This yields the optimal FDR parameter $\gamma = 0.25$. The simulation results are for 1 x 10^ Monte Carlo runs. This shows that the Byzantines reduce the power of the test (detection probability) even though the FDR value is maintained at the prescribed threshold. This leads to the interesting problem of obtaining the optimal transformation $h(\cdot)$ for the Byzantines, i.e., the one which leads to the worst detection performance at the FC. This analysis is much more complex and will be explored in the future.

Fig. 8: Probability of detection against the fraction of Byzantines when the true global hypothesis is $G_1$ for $P_{FA} = 0.1$



VI. Adaptive FDR Based Distributed Detection

In this section, we first demonstrate that the optimal design parameter for the FDR based scheme depends on the fraction of Byzantines ($\alpha$). If prior information is available regarding $\alpha$, we can design our system such that the performance is optimized for the given value of $\alpha$. However, in a dynamically changing environment, it becomes important that we learn this fraction ($\alpha$) over time and change our system design parameters adaptively. We then propose an adaptive algorithm which learns the maliciousness of the network over time and changes the design parameters to improve the system's detection performance in a dynamic manner. In other words, we learn the effect of the Byzantines on the network and mitigate their effect by adaptively changing the system parameters.

A. Optimal parameter design

In this subsection, we first give design guidelines for FDR based distributed detection in the presence of Byzantines. The system-wide probability of false alarm is fixed and given by

$$P_{FA} = P(\Lambda > T; G_0) + \kappa P(\Lambda = T; G_0) \tag{24}$$

where $T$ is the global threshold for the count statistic used at the FC and $\kappa$ is the randomization parameter. The optimal local threshold parameter ($\gamma$ for the FDR-based scheme) is found by maximizing the system-wide probability of detection ($P_D$), which is given by

$$P_D = P(\Lambda > T; G_1) + \kappa P(\Lambda = T; G_1) \tag{25}$$

where the p.m.f. of the count statistic is given by Propositions 5.1 and 5.2. As can be seen from Proposition 5.1, the distribution of the count statistic under $G_0$ does not depend on $\alpha$ and, therefore, the expression for $P_{FA}$ remains the same irrespective of the value of $\alpha$. In this section, we show through simulations that the optimal parameter for the FDR-based approach varies with $\alpha$. Intuitively, the Byzantines decrease the distance (such as KL divergence or KS distance) between the p.m.f.s of the count under global hypotheses $G_0$ and $G_1$. It is, therefore, important to re-optimize the local threshold parameters to increase the distance between these p.m.f.s as much as possible for this fixed value of $\alpha$.
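The randomized count test in (24) and (25) can be sketched as follows: given the p.m.f. of the count under $G_0$, pick the threshold $T$ and randomization parameter $\kappa$ that meet the target $P_{FA}$, then evaluate $P_D$ under $G_1$. The two toy p.m.f.s and the function names are hypothetical; in the setting of the text they would come from Propositions 5.1 and 5.2.

```python
def randomized_threshold(pmf0, p_fa):
    """Return (T, kappa) with P(count > T) + kappa * P(count = T) = p_fa under pmf0.

    Assumes 0 < p_fa <= 1 and that pmf0 sums to one.
    """
    tail = 0.0  # P(count > T) for the current candidate T
    for t in range(len(pmf0) - 1, -1, -1):
        if tail + pmf0[t] >= p_fa:
            kappa = (p_fa - tail) / pmf0[t]
            return t, kappa
        tail += pmf0[t]
    raise ValueError("p_fa must not exceed the total mass of pmf0")


def tail_probability(pmf, t, kappa):
    """P(count > t) + kappa * P(count = t), as in (24) and (25)."""
    return sum(pmf[t + 1:]) + kappa * pmf[t]


# Hypothetical count p.m.f.s on {0, ..., 4}, for illustration only.
pmf_g0 = [0.5, 0.25, 0.125, 0.0625, 0.0625]
pmf_g1 = [0.05, 0.1, 0.15, 0.3, 0.4]

T, kappa = randomized_threshold(pmf_g0, 0.1)
p_fa = tail_probability(pmf_g0, T, kappa)
p_d = tail_probability(pmf_g1, T, kappa)
print(T, kappa, p_fa, p_d)
```

Repeating this for each candidate $\gamma$ and keeping the one with the largest $P_D$ is one way to carry out the maximization described above, since $(T, \kappa)$ must be recomputed whenever $\gamma$ changes.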

For system and target parameters given by $N = 20$, $R = 10$, $P_0 = 5$, $d_0 = 5$ and $P_{FA} = 0.1$, we have shown in Section III that the optimal local threshold parameter is $\gamma = 0.25$ for the FDR based scheme. The optimal parameter $\gamma$ changes with $\alpha$ from $\gamma_{low} = 0.25$ to $\gamma_{high} = 0.1$, as shown in Fig. 9. The simulations were performed for multiple values of $\alpha$ ranging between 0 and 1, but the figures have been omitted for the sake of brevity.



Fig. 9: Probability of detection versus FDR local parameter $\gamma$ when $P_{FA} = 0.1$ and for varying $\alpha$



From these extensive simulations it was observed that the optimal parameter remains nearly constant over different intervals of $\alpha$. For $\alpha < 0.2$, $\gamma_{opt} = \gamma_{low} = 0.25$ and for $\alpha > 0.2$, $\gamma_{opt} = \gamma_{high} = 0.1$. Using this information, we can re-simulate Fig. 8 using the optimal parameters. Fig. 10 shows the improvement in detection performance of the FDR-based scheme using adaptive optimal parameters.

Fig. 10: Probability of detection against the fraction of Byzantines when the true hypothesis is $H_1$ using optimal parameters

Since we have shown that the optimal parameter value depends on the fraction of Byzantines present in the network, it becomes important to learn this fraction in order to adaptively re-design the system using the parameters at hand (local thresholds). We use a modified Kolmogorov-Smirnov test, proposed in the following subsection, to learn the fraction of Byzantines present in the network.

B. Modified Kolmogorov-Smirnov Test 



The Kolmogorov-Smirnov (K-S) test [37] is a goodness-of-fit test which compares observed sample data with a reference null hypothesis distribution. It computes a distance metric (the Kolmogorov-Smirnov distance) between the empirical c.d.f. of the sample and the null hypothesis c.d.f. to decide the goodness-of-fit. It is typically used only for continuous distributions, but Conover [38] has extended it to cover the case of discontinuous distributions. Since the only information known by the FC at every time instant is the count statistic ($\Lambda$), a goodness-of-fit test on the count statistic has to be used to decide the range of $\alpha$.

Description of the test [38]: Let $X_1, X_2, \ldots, X_n$ represent a random sample of size $n$. Denote the null hypothesis by

$$H_0 : F(x) = H(x) \text{ for all } x, \tag{26}$$

where $F(x)$ is the unknown population distribution function, and $H(x)$ is the hypothesized distribution function with all parameters specified. $H(x)$ may be continuous, discrete, or a mixture of the two types. Let $S_n(x)$ represent the empirical distribution function,

$$S_n(x) = \frac{1}{n}\,(\text{the number of } X_j\text{'s which are} \le x), \text{ for all } x. \tag{27}$$

The algorithm proposed by Conover [38] gives a critical value of the sample which quantifies the confidence of the null hypothesis being true. The test statistic $D$ is defined as

$$D = \sup_x |H(x) - S_n(x)| \tag{28}$$

In our case, it is a binary hypothesis test and we would like to compare the sample with the distribution of the count statistic under the low-$\alpha$ regime and the high-$\alpha$ regime:

$$K_0 : P(\Lambda; G_1) \text{ for } \alpha = \alpha_{low} \tag{29}$$

$$K_1 : P(\Lambda; G_1) \text{ for } \alpha = \alpha_{high} \tag{30}$$



Here, we modify the algorithm provided by Conover [38] to generate test statistics $D_i$ for each of the hypotheses ($K_i$). Since we have the distribution of the count statistic for both hypotheses (using the analytical expressions derived in Section V-A), we can find the test statistics under both hypotheses using the given sample data. We then decide the hypothesis $K_i$ for which $D_i$ is larger. The advantage of using the K-S test is that it performs well even for a relatively small number of samples (e.g., 20-30 samples) compared to Pearson's Chi-Square test [39], which requires a larger number of samples.
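As a minimal sketch of the distance in (28) for a discrete reference distribution, the code below compares the empirical c.d.f. of a sample of counts with a hypothesized count p.m.f. It computes only the plain K-S distance; it does not reproduce Conover's critical-level computation or the modified statistics $D_i$ used in the text. The function name and the toy data are hypothetical.

```python
def ks_distance(sample, pmf):
    """sup_x |H(x) - S_n(x)| for integer counts in {0, ..., len(pmf) - 1}.

    Both c.d.f.s are step functions that jump only at the integers,
    so the supremum is attained at one of the integer points.
    """
    n = len(sample)
    h = 0.0  # hypothesized c.d.f. H(x), accumulated from the p.m.f.
    d = 0.0
    for x, p in enumerate(pmf):
        h += p
        s = sum(1 for v in sample if v <= x) / n  # empirical c.d.f. S_n(x)
        d = max(d, abs(h - s))
    return d


counts = [0, 0, 1, 2]                              # hypothetical observed counts
d_fit = ks_distance(counts, [0.5, 0.25, 0.25])     # reference matching the sample
d_off = ks_distance(counts, [0.25, 0.25, 0.5])     # reference shifted to larger counts
print(d_fit, d_off)  # 0.0 0.25
```

In the setting of the text, the two reference p.m.f.s would be $P(\Lambda; G_1)$ evaluated at $\alpha_{low}$ and $\alpha_{high}$.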

C. Adaptive Algorithm 

In this subsection, we propose an adaptive algorithm based on the modified K-S test described above. In this algorithm, at every time instant $t$, the FC stores the count values of the previous $T_0$ time instants for which the global decision $G_1$ was made. Using these $T_0$ data samples, it decides the region in which $\alpha$ lies using the modified K-S test described in Section VI-B. Depending on the decision made, it changes the detector parameters. Let $K_i$ denote the decision made using the modified K-S test; then the detector parameters are changed as

$$\begin{cases} K_0 : \gamma = \gamma_{low},\ T = T_{low},\ \kappa = \kappa_{low} \\ K_1 : \gamma = \gamma_{high},\ T = T_{high},\ \kappa = \kappa_{high} \end{cases} \tag{31}$$

The global threshold parameters $(T, \kappa)$ also need to be changed in order to maintain the system-wide false alarm probability $P_{FA}$ at the desired value.
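The bookkeeping of the adaptive algorithm can be sketched as a sliding window plus a parameter switch. This is only an illustration under stated assumptions: the class name, the $(T, \kappa)$ values and the stand-in classifier are hypothetical, and the classifier passed in takes the place of the modified K-S test, which is not implemented here.

```python
from collections import deque

# Hypothetical parameter sets; the T and kappa values are placeholders,
# only the two gamma values are taken from the text.
PARAMS = {
    0: {"gamma": 0.25, "T": 3, "kappa": 0.5},  # decision K_0 (low alpha)
    1: {"gamma": 0.10, "T": 2, "kappa": 0.5},  # decision K_1 (high alpha)
}


class AdaptiveFusionCenter:
    def __init__(self, window, classify):
        self.counts = deque(maxlen=window)  # last T_0 counts with decision G_1
        self.classify = classify            # stand-in for the modified K-S test
        self.params = PARAMS[0]

    def observe(self, count, decided_g1):
        """Record one time instant and re-select the detector parameters."""
        if decided_g1:
            self.counts.append(count)
        if len(self.counts) == self.counts.maxlen:
            self.params = PARAMS[self.classify(list(self.counts))]
        return self.params


def toy_classifier(window_counts):
    """Hypothetical rule: few detections on average suggests the high-alpha regime."""
    return 1 if sum(window_counts) / len(window_counts) < 5 else 0


fc = AdaptiveFusionCenter(window=3, classify=toy_classifier)
history = [fc.observe(c, True)["gamma"] for c in [8, 9, 7, 1, 2]]
print(history)  # [0.25, 0.25, 0.25, 0.25, 0.1]
```

The window only advances on time instants where $G_1$ was decided, which mirrors the description above; the switch to the high-$\alpha$ parameters happens once the stored counts are dominated by the reduced values.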

We now provide simulation results of our proposed adaptive algorithm. The system and target parameters are: $N = 20$, $R = 10$, $P_0 = 5$, $d_0 = 5$ and $P_{FA} = 0.1$. This gives the optimal FDR parameters $\gamma_{low} = 0.25$ and $\gamma_{high} = 0.1$. For the K-S hypothesis test, we have used $T_0 = 30$ samples, and the distributions for $\alpha_{low} = 0$ and $\alpha_{high} = 0.5$ under the two hypotheses have been found using (21) given in Section V-A. The system is initially Byzantine free, i.e., $\alpha = 0$. At $t = 30$, $\alpha$ changes to 0.7. In Fig. 11, the global detection probability is plotted against time for the proposed adaptive algorithm and a non-adaptive algorithm which continues to use the initial detector parameters. As can be observed, the detection performance improves when the adaptive algorithm is used.

Fig. 11: Probability of detection versus time when $\alpha$ changes from 0 to 0.7 at $t = 30$



VII. Conclusion & Future Work 

In this paper, we explored the problem of FDR based distributed detection in the presence of Byzantines. Building on the work of [1], we showed that system performance can be improved in the non-asymptotic cases by using the Kolmogorov-Smirnov distance as the system design metric instead of the deflection coefficient. We explored the system in the presence of Byzantines and showed that the global network performance degrades in the presence of Byzantines although the FDR value is still controlled at the predetermined threshold ($\gamma$). We analyzed the system performance both theoretically and numerically and proposed an adaptive approach to recover the performance lost in the presence of Byzantines. The proposed scheme learns the fraction of Byzantines present in the network and adaptively changes the system parameters to improve the global detection performance.

There are several directions for future work on this problem. One could explore other distance measures [34] which characterize the system performance and can be used for system design. The optimal attack strategy for the Byzantines needs to be derived, as it would be interesting to see how the performance of the network depends on the optimal attack strategy of the Byzantines defined by $h_{opt}(\cdot)$. Here, we considered the Neyman-Pearson framework. This work could be extended to a Bayesian framework where the problem is to detect the presence of a random target. For this, one may need to use the Bayesian version of FDR, called Bayesian FDR [40] or pFDR [41]. In [40], a rule for control of Bayesian FDR has been proposed which can be used to design local sensor thresholds in distributed detection under a Bayesian framework, similar to the present work.

Appendix A
Proof of Proposition 5.2

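The pdf of a p-value $u$ of a true sensor observation located at a radial distance $r$ from the target is given by

$$f(u; r) = \exp\left(-\frac{\phi^2}{2}\right)\exp\left(\phi\, Q^{-1}(u)\right), \quad 0 < u < 1 \tag{32}$$

where $\phi = 0$ if $r > d_0$ and $\phi = \sqrt{P_0}$ if $r \le d_0$. The marginal pdf of the p-values in the presence of target can be found as

$$f_U(u) = \int_0^{d_0} \exp\left(-\frac{P_0}{2}\right)\exp\left(\sqrt{P_0}\, Q^{-1}(u)\right) f_R(r)\, dr + \int_{d_0}^{R} f_R(r)\, dr \tag{33}$$

$$= \frac{d_0^2}{R^2}\exp\left(-\frac{P_0}{2}\right)\exp\left(\sqrt{P_0}\, Q^{-1}(u)\right) + \left(1 - \frac{d_0^2}{R^2}\right) \tag{34}$$

where $f_R(r) = 2r/R^2$ for $0 \le r \le R$ has been used.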
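The form of (32) can be motivated by a change of variables. The derivation below assumes, for illustration, that the local observation $x$ is unit-variance Gaussian with mean $\phi$ and that the p-value is $u = Q(x)$; this model is our assumption, chosen because it is consistent with the c.d.f. $Q(Q^{-1}(v) - \phi)$ used later in this appendix. Here $g(\cdot)$ denotes the standard normal density, so that $Q'(x) = -g(x)$.

```latex
\begin{align*}
u = Q(x), \quad x \sim \mathcal{N}(\phi, 1)
\;\Longrightarrow\;
f(u; r) &= g(x - \phi)\left|\frac{dx}{du}\right|
         = \frac{g(x - \phi)}{g(x)}\bigg|_{x = Q^{-1}(u)} \\
        &= \exp\!\left(-\frac{(x - \phi)^2}{2} + \frac{x^2}{2}\right)
         = \exp\!\left(-\frac{\phi^2}{2}\right)\exp\!\left(\phi\, Q^{-1}(u)\right).
\end{align*}
```

Setting $\phi = 0$ gives $f(u; r) = 1$, the uniform density expected of a p-value when no signal is received.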

This gives the marginal pdf of the reported p-values $v$ as

$$f_{V_i}(v) = \begin{cases} f_U(v) & \text{if } i \text{ is an honest sensor} \\ f_U(1 - v) & \text{if } i \text{ is a Byzantine sensor} \end{cases} \tag{35}$$

Under the assumption of independent p-values of the sensor observations, the reported p-values $v$ are also independent. The FDR control algorithm requires ordering of these reported p-values, denoted by $v_{1,N} < v_{2,N} < \cdots < v_{N,N}$, which are correlated (due to the ordering) with joint pdf given by

$$f_{V_{1,N} V_{2,N} \cdots V_{N,N}} = N!\, f_{V_1}(v_1) f_{V_2}(v_2) \cdots f_{V_N}(v_N), \quad 0 < v_{1,N} < v_{2,N} < \cdots < v_{N,N} \tag{36}$$

where the marginal density $f_{V_i}(v_i)$ is given by (35).

These ordered reported p-values are compared against linearly increasing thresholds to get the number of detections and, therefore, the probability can be found as

$$P(\Lambda = i; G_1) = \int_{v_{N,N}=\gamma}^{1} \cdots \int_{v_{i+1,N}=((i+1)/N)\gamma}^{v_{i+2,N}} \int_{v_{i,N}=0}^{i\gamma/N} \cdots \int_{v_{1,N}=0}^{v_{2,N}} N!\, f_{V_1}(v_1) f_{V_2}(v_2) \cdots f_{V_N}(v_N)\, dv_{1,N}\, dv_{2,N} \cdots dv_{N,N} \tag{37}$$

However, since any $M$ of the $N$ sensors can be Byzantines, we need to take an average over the $\binom{N}{M}$ possibilities, which gives the desired result.

For a large number of sensors, we can derive the approximate distribution of the count statistic $\Lambda$ under $G_1$ using the result by Genovese and Wasserman, which states that asymptotically the Benjamini-Hochberg method corresponds to classifying as $H_1$ all p-values that are less than a particular threshold $v^*$, where $v^*$ is the solution to the equation

$$F(v) = \beta v \tag{38}$$

and

$$\beta = \frac{1/\gamma - A_0}{1 - A_0} \tag{39}$$

Here $F(v)$ is the c.d.f. of the reported p-values under $H_1$ and is assumed to be strictly concave, and $A_0$ is the fraction of true $H_0$s. This threshold $v^*$ is found by assuming a mixture model of the distribution of p-values. For honest sensors, $F_H(v) = Q(Q^{-1}(v) - \phi)$ and for Byzantine sensors $F_B(v) = 1 - Q(Q^{-1}(1 - v) - \phi)$, where $\phi = \sqrt{P_0}$. So, under the mixture model, we have $F(v) = \alpha F_B(v) + (1 - \alpha) F_H(v)$. For a large ROI, on average a fraction $d_0^2/R^2$ of the sensors receive the signal, and therefore $A_0 = 1 - d_0^2/R^2$. Hence, the average probability of detection of an honest sensor is given by

$$p_D^H = (1 - A_0) P(V \le v^* | H_1) + A_0 P(V \le v^* | H_0) \tag{40}$$

$$= (1 - A_0) \int_0^{v^*} f_U(u)\, du + A_0 v^* \tag{41}$$

where $f_U(u)$ is given by (15).

Similarly, the probability of detection of a Byzantine sensor is given by

$$p_D^B = (1 - A_0) P(V \le v^* | H_1) + A_0 P(V \le v^* | H_0) \tag{42}$$

$$= (1 - A_0) P(1 - U \le v^* | H_1) + A_0 P(1 - U \le v^* | H_0) \tag{43}$$

$$= (1 - A_0) \int_{1 - v^*}^{1} f_U(u)\, du + A_0 v^* \tag{44}$$
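The asymptotic threshold and the two local probabilities can be computed numerically. The sketch below is illustrative only and rests on several assumptions: it takes $\beta = (1/\gamma - A_0)/(1 - A_0)$ as the Genovese and Wasserman constant, uses the c.d.f.s $F_H$ and $F_B$ in place of the integrals in (41) and (44), and finds a root of $F(v) = \beta v$ by plain bisection. The function names and $\alpha = 0.4$ are hypothetical choices; the remaining parameter values are those of the large-network simulations in Section V-B.

```python
import math
from statistics import NormalDist

STD = NormalDist()


def cdf_honest(v, phi):
    """F_H(v) = Q(Q^{-1}(v) - phi), written with the normal c.d.f."""
    return STD.cdf(STD.inv_cdf(v) + phi)


def cdf_byzantine(v, phi):
    """F_B(v) = 1 - F_H(1 - v), the c.d.f. of the flipped p-value 1 - U."""
    return 1.0 - cdf_honest(1.0 - v, phi)


def cdf_mixture(v, phi, alpha):
    return alpha * cdf_byzantine(v, phi) + (1.0 - alpha) * cdf_honest(v, phi)


def solve_threshold(phi, alpha, beta, lo=1e-12, hi=1.0 - 1e-12, iters=200):
    """Bisection for a nonzero root of F(v) - beta * v on (0, 1)."""
    def g(v):
        return cdf_mixture(v, phi, alpha) - beta * v
    if g(lo) <= 0 or g(hi) >= 0:
        raise ValueError("no sign change on the bracket")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


p0, radius, d0, gamma, alpha = 15.0, 10.0, 3.0, 0.0077, 0.4
phi = math.sqrt(p0)
a0 = 1.0 - d0 ** 2 / radius ** 2
beta = (1.0 / gamma - a0) / (1.0 - a0)   # assumed form of the constant in (38)

v_star = solve_threshold(phi, alpha, beta)
p_h = (1.0 - a0) * cdf_honest(v_star, phi) + a0 * v_star      # honest sensor, cf. (40)
p_b = (1.0 - a0) * cdf_byzantine(v_star, phi) + a0 * v_star   # Byzantine sensor, cf. (42)
p_avg = alpha * p_b + (1.0 - alpha) * p_h
print(v_star, p_h, p_b, p_avg)
```

If the mixture c.d.f. is not strictly concave, the bisection still returns a root but not necessarily a unique one, which is why the text states concavity as an assumption.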
The probability of $\Lambda = i$ detections (count) when the target is present is provided by

$$P(\Lambda = i; G_1) = \sum_{k=\max(0,\, M-N+i)}^{\min(M,\, i)} \binom{M}{k}\binom{N-M}{i-k} (p_D^B)^{k} (1 - p_D^B)^{M-k} (p_D^H)^{i-k} (1 - p_D^H)^{N-M-i+k} \tag{45}$$

or

$$P(\Lambda = i; G_1) = \binom{N}{i} (p_D)^{i} (1 - p_D)^{N-i} \tag{46}$$

where $p_D$ is the average probability of detection of a sensor given by

$$p_D = \alpha p_D^B + (1 - \alpha) p_D^H \tag{47}$$



Also, we can further approximate this expression using DeMoivre-Laplace theorem as a Gaussian 
distribution 
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